Motivated by a real-life problem of sharing social network data that contain sensitive personal information, we propose a novel approach to releasing and analyzing synthetic graphs that protects the privacy of the individual relationships captured by the social network while maintaining the validity of statistical results. A case study using a version of the Enron e-mail corpus dataset demonstrates the application and usefulness of the proposed techniques in solving the challenging problem of maintaining privacy \emph{and} supporting open access to network data, both to ensure the reproducibility of existing studies and to enable the discovery of new scientific insights from analyzing such data. We use a simple yet effective randomized response mechanism to generate synthetic networks under $\epsilon$-edge differential privacy, and then use likelihood-based inference for missing data and Markov chain Monte Carlo techniques to fit exponential-family random graph models to the generated synthetic networks.
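
As an illustration of the randomized response step, the following is a minimal sketch and not the authors' implementation. It assumes an undirected graph with no self-loops and the standard symmetric form of randomized response, in which every dyad (potential edge) is flipped independently with probability $1/(1+e^{\epsilon})$; the abstract does not state which flip probabilities are used. The function names `flip_probability` and `randomized_response_graph` are hypothetical.

```python
import math
import random
from itertools import combinations


def flip_probability(epsilon):
    """Per-dyad flip probability for symmetric randomized response.

    Assumption: every dyad uses the same probability p = 1 / (1 + e^epsilon),
    so the likelihood ratio (1 - p) / p of any released dyad equals e^epsilon.
    """
    return 1.0 / (1.0 + math.exp(epsilon))


def randomized_response_graph(n_nodes, edges, epsilon, rng=None):
    """Return a synthetic edge set by independently perturbing each dyad.

    n_nodes: number of nodes, labeled 0 .. n_nodes - 1
    edges:   iterable of (i, j) pairs of the original undirected graph
    epsilon: privacy parameter (larger means less noise)
    rng:     optional random.Random instance, for reproducibility
    """
    rng = rng or random.Random()
    p_flip = flip_probability(epsilon)
    true_edges = {frozenset(e) for e in edges}

    synthetic = set()
    for i, j in combinations(range(n_nodes), 2):
        present = frozenset((i, j)) in true_edges
        # Flip the edge indicator (edge <-> non-edge) with probability p_flip.
        if rng.random() < p_flip:
            present = not present
        if present:
            synthetic.add((i, j))
    return synthetic


if __name__ == "__main__":
    original = [(0, 1), (1, 2), (2, 3)]
    released = randomized_response_graph(5, original, epsilon=1.0,
                                         rng=random.Random(42))
    print(sorted(released))
```

Because each dyad is perturbed independently and two graphs that differ in one edge differ in exactly one dyad, the ratio of release probabilities is bounded by $e^{\epsilon}$ under this assumed scheme. Note that the loop visits all $n(n-1)/2$ dyads, so a sparse network generally becomes much denser after perturbation when $\epsilon$ is small, which is one reason the downstream model fitting has to account for the noise mechanism explicitly.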